ABSTRACT

An important aspect of digital communications is the problem of determining
efficient methods for acquiring block synchronization. In this paper we
consider a sync technique based on the recognition of successive error-free
digits from a known sequence.

The analysis of this technique draws from the theory of success runs.
This theory is reviewed, and a simple recurrence relation is developed for
computing the probability of the first occurrence of an error-free run of
r digits in a binary sequence corrupted by noise. This relation is then
applied to the analysis of the sync process, which utilizes an N-digit sync
sequence as prefix to the data blocks. The results of this study show that
this technique is a practical method for acquiring block synchronization.
SECTION I


INTRODUCTION 


In digital communications an important problem is that of acquiring block
synchronization at the receiver. A common block synchronization technique
consists of prefixing a known sequence of length N digits to a D-digit data word.
This scheme utilizes a correlation detector with an error threshold to
recognize the sync prefix in the presence of channel noise. An in-sync indication
is given after recognition of a fixed number of correctly spaced, consecutive,
N-digit sequences.

In this paper we consider a different detection scheme. Instead of searching
for N digits, allowing some errors, we search for a success run of length
r, i.e., an r-digit error-free segment of the N-digit sync sequence (r < N).
The analysis of this scheme draws from the area of probability theory related
to recurrent events and, more specifically, to success runs. We analyze
what we term a "run-length" synchronization technique assuming, for lack of
a priori knowledge, that the D-digit data words are randomly generated and,
furthermore, that errors in the N-digit sync sequence occur randomly with
probability p.

We begin by stating the criteria that will be used to evaluate the
performance of this sync technique. Then the theory of success runs is
reviewed. A new, simplified recurrence relation is given for the probability
of the first occurrence of a success run, which facilitates the calculation
of the various performance criteria.

The sync process is then introduced and analyzed. In addition, the results
are presented in graphs for general applications.
SECTION II


ANALYSIS AND RESULTS 

PERFORMANCE CRITERIA 

The purpose of a synchronization technique is to determine the beginning
of a data word. In a success-run technique the encoder prefixes an N-digit
sequence to each data word of length D, and the decoder searches for a
consecutive error-free run of r out of the N digits. When a run length of r
digits is recognized, the system generates the remaining digits of the prefix
and then declares word synchronization.

The performance of this particular synchronization scheme, when digit
errors are present in the prefixed sync pattern, is evaluated by several
performance criteria.

1. The probability $P_N$ of acquiring sync is computed given that the
decoder is examining the N-digit prefix.

2. The probability $P_D$ of acquiring a (false) sync indication is
computed given that the decoder is examining the D-digit (random) word.

3. The probability of acquiring a true sync for a B = (D + N) digit
block, designated by $P_B$, is lower bounded by assuming that the
decoder examines first all D information digits and then the N
sync digits. We upper bound this probability by assuming that the
decoder examines first all N sync digits and then the D information
digits. Hence

$$(1 - P_D)\,P_N \le P_B \le P_N .$$

4. The usual requirement imposed on a sync scheme is that, with
probability greater than $1 - 10^{-k}$ (k a fixed integer), the system
must be in true block sync within $t_0$ seconds for a given transmission
rate R. In general, the probability $P_B$ of acquiring sync in one
block will not satisfy this requirement. However, the probability
of synchronizing correctly in T blocks is

$$P_T = 1 - (1 - P_B)^T ,$$

and the number of blocks that can be examined in $t_0$ seconds at
a data rate R is

$$T_0 = \frac{R}{B}\, t_0 .$$

Thus, the criterion to satisfy becomes $P_T > 1 - 10^{-k}$ with
$T \le T_0$, for a given k and channel error rate p.

THEORY OF SUCCESS RUNS 


In this section we review the computation of the probability of success
runs of length r in a sequence of Bernoulli trials. The first part of the
discussion can be found in Feller [1] and is included here for completeness.


We are given a sequence of Bernoulli trials with p the probability that
a trial results in a failure (a digit is in error) and q = 1 - p the probability
of success. (The words trial and digit are used interchangeably.) We say
that a success run of length r, in a sequence of Bernoulli trials, occurs
at the n th trial only if the n th trial results in the r th consecutive success
and also adds a new run to the sequence. Let $u_n$ be the probability that a
success run of length r is obtained at the n th digit, and $f_n$ the probability
that the first success run of length r occurs at the n th digit.

The probability that any r consecutive digits are correct is $q^r$. We
look at the r digits numbered n - r + 1, n - r + 2, ..., n - 1, n.
Given that these r consecutive digits are correct, a success run
must occur at one of these r digits. With probability $u_{n-i}$ a run occurs
at the (n - i) th digit, and therefore with probability $u_{n-i}\,q^i$ a run
occurs at the (n - i) th digit and the i digits that follow are also correct.
The events with probabilities $q^i u_{n-i}$ are
mutually exclusive for i = 0, 1, 2, ..., r - 1, by the definition of a
success run (i.e., if an r-digit success run occurs at the m th trial, then
$u_{m+1} = u_{m+2} = \cdots = u_{m+r-1} = 0$). Adding these events, we obtain
the recurrence relation

$$q^{r-1} u_{n-r+1} + q^{r-2} u_{n-r+2} + \cdots + q\,u_{n-1} + u_n = q^r \tag{1}$$

for $n \ge r$. The boundary conditions are given as $u_0 = 1$,
$u_1 = u_2 = \cdots = u_{r-1} = 0$.

This recurrence relation (difference equation) is solved
by introducing the generating function

$$\mathcal{U}(s) = \sum_{n=0}^{\infty} u_n s^n .$$

After some algebra we obtain

$$\mathcal{U}(s) = \frac{1 - s + p q^r s^{r+1}}{(1 - s)(1 - q^r s^r)} .$$

The value $u_n$ represents the probability of a success run at the n th trial.
We more precisely desire the probability of the first success run at the n th
trial ($f_n$). The probability that a success run occurs for the first time at
trial number m and another success run occurs at a later trial n > m is, by
definition, $f_m u_{n-m}$. For example, the probability that a success run occurs
at the n th trial for the first time is $f_n = f_n u_0$. Recall that $u_0 = 1$.
Since the events with probabilities $f_m u_{n-m}$ are mutually exclusive we have

$$u_n = f_1 u_{n-1} + f_2 u_{n-2} + \cdots + f_n u_0 \tag{2}$$

for $n \ge 1$, with the boundary conditions on $u_n$ given above. The right side
of Equation (2) is the convolution of the sequences $\{f_n\}$ and $\{u_n\}$, which
has the generating function $\mathcal{F}(s)\,\mathcal{U}(s)$, and the left side has the
generating function $\mathcal{U}(s) - u_0$ with $u_0 = 1$. Thus the generating function
for the recurrence times, $\mathcal{F}(s)$, is given by the relation

$$\mathcal{F}(s) = \frac{\mathcal{U}(s) - 1}{\mathcal{U}(s)} = \frac{q^r s^r (1 - qs)}{1 - s(1 - p q^r s^r)} . \tag{3}$$


The general expression for $f_n$ can be obtained by expanding the above
equation in powers of s. The result is

$$f_n = q^r \sum_{k \ge 0} (-p q^r)^k \left[ \binom{n-(k+1)r}{k} - q \binom{n-(k+1)r-1}{k} \right] \tag{4}$$

where $\binom{a}{b}$ is the binomial coefficient, and the summation
terminates when n - (k+1)r < 0. The first few terms are:

$$f_1 = f_2 = \cdots = f_{r-1} = 0$$

$$f_r = q^r$$

$$f_{r+i} = p q^r \quad \text{for } 1 \le i \le r$$

$$f_{2r+i} = p q^r \left[ 1 - q^r \bigl( 1 + (i-1)p \bigr) \right] \quad \text{for } 1 \le i \le r$$

The mean and variance of the number of trials to the first success run
of length r are calculated from the first and second derivatives of the
generating function $\mathcal{F}(s)$ evaluated at s = 1 and are given respectively by

$$\mu = \frac{1 - q^r}{p q^r} \tag{5}$$

$$\sigma^2 = (p q^r)^{-2} - (2r + 1)(p q^r)^{-1} - q p^{-2} \tag{6}$$
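Equations (5) and (6) are straightforward to evaluate numerically. The short Python sketch below does so; the function name `run_mean_std` is an illustrative choice, not part of the original report.

```python
import math


def run_mean_std(p, r):
    """Mean and standard deviation of the number of digits up to and
    including the first error-free run of length r (Equations (5), (6))."""
    q = 1.0 - p
    a = p * q**r
    mean = (1.0 - q**r) / a
    variance = a**-2 - (2 * r + 1) / a - q / p**2
    return mean, math.sqrt(variance)
```

With p = 0.5 and r = 10 this gives a mean of 2046.0 and a standard deviation of about 2037.47, matching the first row of Table I.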
Table I lists the values of $\mu$ and $\sigma$ for various p and r. The cumulative
probability

$$F(n;r) = \sum_{i=1}^{n} f_i$$

is the probability that the first success run of length r occurs within n
trials. For large n the computation of F(n;r) becomes unwieldy using
Equation (4). Feller [1] suggests that F(n;r) can be approximated with a
normal distribution. We show, however, that this approximation can be
grossly inaccurate.


Specifically, let A(n;r) be the normal approximation to F(n;r); then

$$A(n;r) = \varphi(\alpha) - \varphi(\beta)$$

where $\alpha$ and $\beta$ are the upper and lower limits of the approximation.
Using the expressions for $\mu$ and $\sigma$ given above, it can be shown that the lower
limit, $\beta$, has a value between -1 and 0 regardless of the values of r and p.
Hence, as $n \to \infty$, we have $A(n;r) \to 1 - \varphi(\beta) \le 0.8413$, but we know that
$F(n;r) \to 1$ as $n \to \infty$.

We now examine $f_n$ and F(n;r) in detail to obtain an alternate technique
for describing recurrence times in run-length problems. The algebraic details
are given in the Appendix; hence only the results are presented here.
Table I

Values of $\mu$ and $\sigma$ for the Probability of First Success
Run of Length r

   p         r     Mean ($\mu$)              Standard Deviation ($\sigma$)

   0.5       10    2046.0                    2037.47
   0.1       10    18.68                     11.413
   0.01      10    10.573                    2.074
   0.001     10    10.055                    0.624
   0.0001    10    10.0055                   0.196

   0.5       20    2,097,150.0               2,097,131.5
   0.1       20    72.253                    57.473
   0.01      20    22.263                    5.96
   0.001     20    20.212                    1.712
   0.0001    20    20.021                    0.536

   0.5       30    $2.148 \times 10^{9}$     $2.147 \times 10^{9}$
   0.1       30    225.9                     202.9
   0.01      30    35.19                     11.39
   0.001     30    30.47                     3.123
   0.0001    30    30.046                    0.974

   0.5       40    ~$2.2 \times 10^{12}$     ~$2.199 \times 10^{12}$
   0.1       40    666.550                   634.69
   0.01      40    49.483                    18.36
   0.001     40    40.832                    4.803
   0.0001    40    40.082                    1.491
In Equation (4), with n = mr + i for $1 \le i \le r$, the upper limit on the
summation is m - 1. Thus

$$f_n = q^r \sum_{k=0}^{m-1} (-p q^r)^k \left[ \binom{n-(k+1)r}{k} - q \binom{n-(k+1)r-1}{k} \right] \tag{7}$$

with the normal convention $\binom{i}{m-1} = 0$ for i < m - 1. We also have
previously defined F(N;r) as

$$F(N;r) = \sum_{n=r}^{N} f_n \tag{8}$$

where we take advantage of the boundary conditions
$f_1 = f_2 = \cdots = f_{r-1} = 0$.
Using these last two equations and the combinatorial relation

$$\binom{n}{m} = \sum_{k=m-1}^{n-1} \binom{k}{m-1} \tag{9}$$

we obtain

$$F(N;r) = q^r \sum_{k=0}^{m-1} (-p q^r)^k \left[ \binom{N-(k+1)r+1}{k+1} - q \binom{N-(k+1)r}{k+1} \right] \tag{10}$$
where N = mr + i with $1 \le i \le r$. Looking at Equations (7) and (10) we see
that they have a similar form, and with a little more manipulation we obtain

$$f_N = \left[ 1 - F(N-r-1;r) \right] p q^r \tag{11}$$

for $N \ge r + 1$. For N = r we have $f_r = q^r$. Equation (11) can be interpreted
as the probability of the joint events (1) that a success run did not occur in
the first N-r-1 digits, (2) the (N-r) th digit is in error, and (3) the
(N-r+1) th to the N th digits are error-free. Thus $f_N$, the probability
that the first r-digit success run is obtained at the N th digit, can be
determined directly and in a form that can be verified intuitively.

Using Equation (11) we also obtain the recurrence relation on $f_n$ as

$$f_{n+1} = f_n - p q^r f_{n-r} \tag{12}$$

for $n \ge r + 1$ with initial conditions $f_1 = f_2 = \cdots = f_{r-1} = 0$,
$f_r = q^r$, and $f_{r+1} = p q^r$. The derivation of Equation (12) is included
in the Appendix. This formulation is advantageous for computation on a
high-speed digital computer.
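The recurrence of Equation (12) can be sketched in a few lines of Python. The block below is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, and the second function is an independent digit-by-digit computation (tracking the length of the current error-free run) added only to cross-check the recurrence.

```python
from itertools import accumulate


def first_run_pmf(n_max, r, p):
    """Return [f_0, ..., f_n_max] using f_{n+1} = f_n - p q^r f_{n-r}."""
    q = 1.0 - p
    a = p * q**r
    f = [0.0] * (n_max + 1)
    if n_max >= r:
        f[r] = q**r          # initial condition f_r = q^r
    if n_max >= r + 1:
        f[r + 1] = a         # initial condition f_{r+1} = p q^r
    for n in range(r + 1, n_max):
        f[n + 1] = f[n] - a * f[n - r]
    return f


def first_run_pmf_direct(n_max, r, p):
    """Cross-check: propagate the probability of each current run length."""
    q = 1.0 - p
    # state[j] = P(no run of r so far and the last j digits are correct)
    state = [0.0] * r
    state[0] = 1.0
    f = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        f[n] = state[r - 1] * q          # run of r completed at digit n
        new_state = [0.0] * r
        new_state[0] = p * sum(state)    # an error resets the run
        for j in range(1, r):
            new_state[j] = q * state[j - 1]
        state = new_state
    return f


def cumulative(f):
    """F(n;r) for n = 0, ..., len(f) - 1 as running sums of f."""
    return list(accumulate(f))
```

Each new value costs one multiplication and one subtraction, so F(N;r) for a whole range of N is obtained in a single pass.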

RUN LENGTH SYNCHRONIZATION PROCESS 

The goal of any sync scheme is to optimize the probability of acquiring 
true word sync while minimizing the false sync probability. In this context 
we now discuss the properties of a sync detection process which utilizes the 
concept of run lengths. 
The performance of the detection process of a run-length sync technique
depends on the sync sequence used. Recall that we have an N-digit sync
sequence, S(N), and are looking for the first occurrence of an error-free
run of r digits, where $r \le N$. In general, we need a sync sequence that
has "good" correlation properties. Specifically, we require that each of the
r-digit segments of S(N) be distinct (there are N-r+1 such segments).
Then, once an r-digit run is found, the end-of-block can be uniquely
ascertained. We also want the Hamming distance [2] between any two distinct
r-digit segments of S(N) to be as large as possible. This requirement
minimizes the probability of false sync when S(N) is being examined.

A natural choice which satisfies these requirements is a maximal length
pseudo-random (PR) sequence [3]. Such a sequence, of length $2^k - 1$, is
generated by the simple implementation of a k-stage shift register in which
the feedback connections correspond to a k-degree primitive polynomial over
the binary field [2]. The properties of this cyclic sequence are:

1. Given any k digits the next $2^k - 1 - k$ digits are uniquely determined.
Thus any segment of the sequence with length greater than or equal
to k is distinct.

2. The autocorrelation function is two-valued.

3. One half of the runs of consecutive digits of the same kind are of length
1, 1/4 are of length 2, ..., $2^{-(k-1)}$ are of length k-1.

Using a portion of a PR sequence for S(N), and choosing N and r such
that $k < r \le N \le 2^k - 1$, we not only satisfy the necessary requirements listed
but can generate and detect the sync sequence quite easily.
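To make the first property concrete, the following Python sketch generates a maximal length sequence for the small case k = 4. The primitive polynomial x^4 + x + 1, the seed, and the function names are assumptions chosen for illustration; the report does not specify a particular polynomial.

```python
def pr_sequence(length, seed=(1, 0, 0, 0)):
    """Digits of the recurrence a[n+4] = a[n+1] XOR a[n], i.e. a 4-stage
    shift register with feedback taken from x^4 + x + 1 (assumed here)."""
    a = list(seed)
    while len(a) < length:
        a.append(a[-3] ^ a[-4])
    return a[:length]


def segments(seq, r):
    """All r-digit segments of seq, in order of starting position."""
    return [tuple(seq[i:i + r]) for i in range(len(seq) - r + 1)]


# One period has 2^4 - 1 = 15 digits; take S(N) as its first N = 12 digits.
sync_sequence = pr_sequence(12)
r = 6
distinct = len(set(segments(sync_sequence, r))) == len(sync_sequence) - r + 1
```

Here `distinct` is true: all N - r + 1 = 7 segments of length r differ, so finding any one of them fixes the position within S(N).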

The total probability of acquiring true sync in one block depends on
the number of chances, (N-r+1), to find a run of r consecutive error-free
digits. This implies that r should be small. We must, however, choose r
large to make the false sync probability within S(N) negligible. Thus the
final choice of r depends on a trade-off analysis which is beyond the scope
of this study.

The detection process is implemented by sequentially shifting the data
stream through a k-stage shift register which corresponds to the maximal
length sequence being used. At each clock time the contents of the register
are mod 2 added according to the tap positions, and this sum digit is
compared with the next data digit which is about to enter the register. A counter
is increased by one if these two digits are the same; otherwise the counter
is set to zero. Sync is declared if and only if the count
reaches r-k. This means that r consecutive data digits match r
digits of the pseudo-random sequence. This detection process is based on
the property that any k consecutive digits of the pseudo-random sequence,
in the absence of channel errors, uniquely generate the remaining r-k digits
(and indeed, the remaining part of the PR sequence).
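The detector just described can be sketched in Python for the small case k = 4. This is a minimal illustration, assuming the feedback polynomial x^4 + x + 1 and hypothetical function names; a hardware implementation would of course differ.

```python
def pr_sequence(length, seed=(1, 0, 0, 0)):
    """PR digits from a[n+4] = a[n+1] XOR a[n] (assumed polynomial x^4 + x + 1)."""
    a = list(seed)
    while len(a) < length:
        a.append(a[-3] ^ a[-4])
    return a[:length]


def detect_sync(stream, r, k=4):
    """Return the 1-based position of the digit at which the counter first
    reaches r - k, or None if that never happens within the stream."""
    register = []   # the last k received digits, oldest first
    count = 0
    for position, digit in enumerate(stream, start=1):
        if len(register) == k:
            # Mod 2 sum of the tapped stages predicts the incoming digit.
            predicted = register[0] ^ register[1]
            count = count + 1 if predicted == digit else 0
            if count == r - k:
                return position
        register.append(digit)
        if len(register) > k:
            register.pop(0)
    return None


clean = pr_sequence(15)
corrupted = list(clean)
corrupted[2] ^= 1   # single channel error in the third digit
```

With r = 8, `detect_sync(clean, 8)` returns 8, and `detect_sync(corrupted, 8)` returns 11: the count reaches r - k only after r consecutive error-free digits have followed the error.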

PROBABILITY OF OBTAINING TRUE SYNCHRONIZATION: $P_N$

A block consists of B = D + N digits. In this section we consider the
case in which the detector is examining the N-digit sync sequence, which is
some fixed portion of a $2^k - 1$ digit pseudo-random sequence. The probability
at the n th trial of obtaining sync is the probability that the first count of
r-k is attained at that trial. The detection process can be thought of as taking
any k digits of S(N), generating the next r-k digits off-line, then
comparing them with the corresponding r-k digits of the received sequence.

Thus the first count of r-k is attained at the n th trial if and only if r
consecutive digits of the received sync sequence correspond to some r-digit
segment of S(N). Hence, the probability of synchronizing at the n th trial
within the sync sequence is $f_n$, as given by Equation (7). It follows that
the probability of acquiring true word sync per block, $P_N$, from Equation (10)
is

$$P_N = F(N;r) , \tag{13}$$

which is the probability of obtaining the first run of r digits within N digits
with channel error probability p.

This function F(N;r) has been calculated for various values of p and
r and illustrated on graphs in Figures 1, 2, and 3. Actually, the function
1 - F(N;r) is plotted versus N at values of $p = 10^{-1}$, $10^{-2}$, and $10^{-3}$ for
r = 10, 20, 30, and 40.

FALSE SYNCHRONIZATION PROBABILITY PER BLOCK: $P_D$

When the D-digit random data sequence is being examined, a (false)
sync is declared at the first digit at which a count of r-k is reached in the
detector. Therefore, the problem is still one of run lengths. The sought-for
relations cannot be obtained, however, simply by setting p = q = 1/2 in
Equations (7) and (10), which were derived assuming the sync sequence was
being examined.

In this case, the probability that an r-digit segment of the random
sequence looks like a specific segment of the cyclic PR sequence is $2^{-r}$;
however, any of $2^k$ distinct r-digit segments of the PR sequence will yield
an in-sync indication. (The all-zero segment, while not a part of the PR
sequence, must be included as a possibility.) Thus, the probability that any
r-digit segment of the randomly generated data word will give an in-sync
indication is $2^{-(r-k)}$, not $2^{-r}$. In view of this, we let $f^*_n$ be the probability
Figure 1. Cumulative Probability of Not Obtaining the First Success
Within n Trials vs n for $p = 10^{-1}$

Figure 2. Cumulative Probability of Not Obtaining the First Success
Within n Trials vs n for $p = 10^{-2}$

Figure 3. Cumulative Probability of Not Obtaining the First Success
Within n Trials vs n for $p = 10^{-3}$
that the n th trial within the random data sequence yields the first success run
of r digits (with p = 1/2). We have $f^*_1 = f^*_2 = \cdots = f^*_{r-1} = 0$ and
$f^*_r = 2^{-(r-k)}$. Arguing in the manner used to interpret Equation (11) we
obtain the general expression for $f^*_n$ as

$$f^*_n = \left[ 1 - F^*(n-r-1;r) \right] (1/2)\,(1/2)^{r-k} \tag{14}$$

for n > r, where

$$F^*(n-r-1;r) = \sum_{j=r}^{n-r-1} f^*_j . \tag{15}$$

Summing Equation (14) over D digits we obtain the cumulative probability of
false sync per block as

$$P_D = F^*(D;r) = \sum_{n=r}^{D} f^*_n . \tag{16}$$


Equation (14) looks, at first glance, as if it describes an (r-k) run-length
process with p = q = 1/2. A closer analysis reveals that this is not
true. However, if $2^{-(r-k)}$ is small enough, Equation (16) can be
approximated by Equation (10) for a run length of r-k and q = 1/2. That is, using
the first two terms of Equation (10) as an approximation to F(D;r) we have

$$F(D;r) \approx q^r \left[ 1 + p\,(D-r) \right] . \tag{17}$$
Similarly, for a run length of r-k we have

$$F(D;r-k) \approx q^{r-k} \left[ 1 + p\,(D-r+k) \right] \tag{18}$$

or, with q = p = 1/2,

$$F(D-k;r-k) \approx 2^{-(r-k)} \left[ 1 + \tfrac{1}{2}(D-r) \right] . \tag{19}$$


Now, $F^*(D;r)$ can be precisely written as

$$F^*(D;r) = 2^{-r+k} + 2^{-r+k-1} \sum_{n=r+1}^{D} \left[ 1 - F^*(n-r-1;r) \right] , \tag{20}$$

or

$$F^*(D;r) = 2^{-r+k-1}(D-r+2) - 2^{-r+k-1} \sum_{m=r}^{D-r-1} F^*(m;r) . \tag{21}$$


Now, if $2^{-(r-k)}$ is small enough we can neglect the summation in
Equation (21) to obtain

$$F^*(D;r) \approx 2^{-r+k-1}(D-r+2) , \tag{22}$$

which is the same as Equation (19). Thus we have

$$P_D = F^*(D;r) \approx F(D-k;r-k) , \tag{23}$$

which is a satisfactory approximation for most applications. The function
F(n;r-k) with p = q = 1/2 is plotted on the graphs in Figure 4 for values of
r-k = 10, 20, and 30.
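The quality of the approximation in Equation (22) can be examined numerically. The Python sketch below evaluates Equations (14) through (16) by direct recursion and compares the result with Equation (22); the function names and the sample parameters D = 1000, r = 30, k = 10 are illustrative assumptions only.

```python
def false_sync_probability(D, r, k):
    """P_D = F*(D;r), from f*_r = 2^-(r-k) and
    f*_n = [1 - F*(n-r-1;r)] (1/2) 2^-(r-k) for n > r."""
    a = 2.0 ** -(r - k)
    f = [0.0] * (D + 1)
    F = [0.0] * (D + 1)          # F[n] = F*(n;r), with F*(n;r) = 0 for n < r
    for n in range(1, D + 1):
        if n == r:
            f[n] = a
        elif n > r:
            f[n] = (1.0 - F[n - r - 1]) * 0.5 * a
        F[n] = F[n - 1] + f[n]
    return F[D]


def false_sync_approximation(D, r, k):
    """Equation (22): 2^(-r+k-1) (D - r + 2)."""
    return 2.0 ** (-r + k - 1) * (D - r + 2)


exact = false_sync_probability(1000, 30, 10)
approx = false_sync_approximation(1000, 30, 10)
```

Because the neglected summation in Equation (21) is non-negative, the approximation is an upper bound on the exact value; for these sample parameters the two agree to within a fraction of a percent.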

The false sync probability in the overlap regions of random data and
sync sequence has been ignored in this analysis. Actually, the probability
that the last few random digits look like an extension of the sync sequence is
quite high. This "incoming" case could result in inadvertently finding a
correct sync indication. In the "outgoing" case, where the first few random
digits of the next data word could look like an extension of the sync sequence,
the effects would be sufficiently detrimental to warrant the use of added
protective measures. In practice, an end-of-sync-sequence counter could
be used to count the digits between a detected run length and the end of the
sync sequence. If the count reaches N-r+1 or greater, the sync indication
is ignored, thus eliminating this false sync possibility.

PROBABILITY OF SYNCHRONIZING WITHIN A GIVEN TIME 

The usual requirement for any sync technique is that it must be able to
attain an overall probability of synchronization greater than $1 - 10^{-k}$,
with k a fixed integer. One measure of performance for a sync technique is the time
it takes to ensure this probability. Moreover, relative merits of various
sync schemes can be determined by comparing their probability of
synchronizing in a given time interval. This comparison obviously should be made
using similar data rates and block lengths.

For the scheme presented in this paper, the probability $P_B$ of acquiring
true sync in a block of B digits can be bounded as

$$(1 - P_D)\,P_N \le P_B \le P_N \tag{24}$$
Figure 4. Cumulative Probability of Obtaining the First Success
Within n Trials vs n for p = q = 1/2
where the expression $(1 - P_D)\,P_N$ indicates that the detector examines the
total D-digit random word first and then the N-digit sync sequence.

For a given B, N, r, p, and k the probability $P_B$ will not in general
meet the requirement that $P_B > 1 - 10^{-k}$, and hence on the average more than
one block will have to be examined. In T blocks the probability of correct
sync is given by

$$P_T = 1 - (1 - P_B)^T . \tag{25}$$


The number of blocks that can be examined in t seconds for a block
of B digits and a data rate of R digits per second is Rt/B. Let $T_0$ be
the minimum number of complete blocks needed for $P_T \ge 1 - 10^{-k}$ to be
satisfied. In other words, $T_0$ is the smallest integer such that

$$1 - (1 - P_B)^{T_0} \ge 1 - 10^{-k} .$$

Therefore, $T_0$ must satisfy the inequality

$$T_0 \ge \frac{-k}{\log_{10}(1 - P_B)} \tag{26}$$

and is the smallest integer greater than or equal to the right side. The time
it takes to examine this number of blocks is $T_0 B / R$. Thus if $t_0$ is the
maximum allowable time to achieve a probability greater than $1 - 10^{-k}$,
the system must satisfy $T_0 B / R \le t_0$.
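Equations (25) and (26) translate directly into a short calculation. The Python sketch below is illustrative only; the function names and the sample values (P_B = 0.5, k = 3, B = 1000 digits, R = 2400 digits per second) are assumptions, not figures from the report.

```python
import math


def sync_probability(p_block, blocks):
    """Equation (25): probability of correct sync within the given number of blocks."""
    return 1.0 - (1.0 - p_block) ** blocks


def min_blocks(p_block, k):
    """Equation (26): smallest integer T_0 with (1 - P_B)^T_0 <= 10^-k.
    Requires 0 < p_block < 1."""
    return math.ceil(-k / math.log10(1.0 - p_block))


def sync_time(p_block, k, block_length, rate):
    """Seconds needed to examine T_0 blocks: T_0 B / R."""
    return min_blocks(p_block, k) * block_length / rate


blocks_needed = min_blocks(0.5, 3)
seconds_needed = sync_time(0.5, 3, block_length=1000, rate=2400)
```

For these sample values ten blocks are needed, since 0.5^10 is just under 10^-3 while 0.5^9 is not, and at 2400 digits per second the ten 1000-digit blocks take a little over four seconds.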
Figure 5 is a plot of $T_0$ versus $P_B$ for values of $1 - 10^{-k}$ of 0.9,
0.99 and 0.999 (k = 1, 2, and 3 respectively). Once the probability $P_B$ of
acquiring true sync per block is calculated, this graph allows the determination
of the minimum number of blocks needed to be examined to ensure a total
probability greater than $1 - 10^{-k}$.
Figure 5. Plot of Minimum Number of Blocks to Guarantee a Probability
of Acquiring Sync Greater than $1 - 10^{-k}$ vs the Probability
of Synchronizing Per Block
SECTION III


CONCLUSIONS 

We have analyzed a synchronization technique that searches for an
r-digit error-free run within an N-digit sync sequence which is prefixed to
the data words. The N-digit sequence is chosen to be a portion of a cyclic
pseudo-random sequence of length $2^k - 1$ digits. A simple implementation
has been described which employs a k-stage shift register as detector (k < r).
This detector examines the data stream digit by digit, while indexing a
counter, enabling a search for a pattern much longer than the register length.
Most commonly used sync techniques require examination of entire sync
sequence lengths at one time.

In order to evaluate this run-length technique, expressions for several
performance criteria have been derived. These criteria include the
probability of acquiring true sync when the detector is examining the N-digit sync
sequence, and also the probability of a false sync given that the detector is
examining the data word. Graphs are provided which illustrate these
expressions for parameter values that would be chosen for practical schemes. An
interesting property of the analyzed sync scheme is that the probability of 
false sync increases linearly with the length of the data word. In sync 
schemes that allow up to e errors in the sync prefix, the probability of falsely 
synchronizing within the data word increases as where n is the 
length of the data word. 

A more general result of this research which is applicable to problems
in other areas is the derivation of a new recurrence relation for the
probability of the first success run in a sequence of Bernoulli trials. The
formulation presented here is advantageous for computation on a high-speed
digital computer.
APPENDIX


Given the generating function

$$\mathcal{F}(s) = \frac{q^r s^r (1 - qs)}{1 - s(1 - p q^r s^r)} = \sum_{n=0}^{\infty} f_n s^n$$

we derive the general expression for $f_n$. Let

$$\mathcal{F}(s) = q^r s^r (1 - qs)\, G(s)$$

where

$$G(s) = \frac{1}{1 - s(1 - p q^r s^r)} = \sum_{n=0}^{\infty} (1 - p q^r s^r)^n s^n$$

is expressed as a geometric series. Now $(1 - p q^r s^r)^n$ is given as

$$(1 - p q^r s^r)^n = \sum_{i=0}^{n} \binom{n}{i} (-p q^r s^r)^i .$$

Therefore

$$G(s) = \sum_{n=0}^{\infty} s^n \sum_{i=0}^{n} \binom{n}{i} (-p q^r s^r)^i$$

$$= \sum_{n=0}^{\infty} s^n \left[ 1 - \binom{n}{1} (p q^r) s^r + \binom{n}{2} (p q^r)^2 s^{2r} - \cdots + (-1)^n (p q^r)^n s^{nr} \right]$$

$$= \sum_{n=0}^{\infty} g_n s^n .$$

The coefficient $g_n$ is obtained by collecting terms of like power in the above
expression. That is,

$$g_n s^n = s^n \left[ 1 - \binom{n-r}{1} p q^r + \binom{n-2r}{2} (p q^r)^2 - \cdots + \binom{n-mr}{m} (-p q^r)^m \right]$$

where m is the greatest integer less than or equal to n/(r + 1). Using
this summation for G(s) in the expression for $\mathcal{F}(s)$, we obtain

$$\mathcal{F}(s) = q^r s^r \sum_{n=0}^{\infty} g_n (1 - qs) s^n = q^r s^r \left[ g_0 + s(g_1 - q g_0) + s^2 (g_2 - q g_1) + \cdots \right]$$

with $g_0 = 1$. It is apparent that $f_1 = f_2 = \cdots = f_{r-1} = 0$, and if we define
$g_{-1} = 0$ we can write
$$\mathcal{F}(s) = \sum_{n=r}^{\infty} q^r \left[ g_{n-r} - q\, g_{n-r-1} \right] s^n .$$

Thus

$$f_n = q^r \left[ g_{n-r} - q\, g_{n-r-1} \right]$$

for $n \ge r$. Using the above expression for $g_n$ we have

$$f_n = q^r \sum_{k=0}^{m} (-p q^r)^k \left[ \binom{n-(k+1)r}{k} - q \binom{n-(k+1)r-1}{k} \right]$$

where the upper limit on the summation, m, is the greatest integer less than
or equal to (n - r)/(r + 1).

* * * * * 


Now Equation (10) and the simple expression

$$f_N = p q^r \left[ 1 - F(N-r-1;r) \right]$$

for N > r are derived from the previous results.
$$F(N;r) = \sum_{n=r}^{N} f_n = q^r \sum_{n=r}^{N} \sum_{k \ge 0} (-p q^r)^k \left[ \binom{n-(k+1)r}{k} - q \binom{n-(k+1)r-1}{k} \right]$$

We interchange the order of summation and look at the terms

$$\sum_{n=r}^{N} \binom{n-(k+1)r}{k} \quad \text{and} \quad \sum_{n=r}^{N} \binom{n-(k+1)r-1}{k} .$$

Starting with the well-known combinatorial relation

$$\binom{n}{m} = \binom{n-1}{m-1} + \binom{n-1}{m}$$

we repeatedly use this expansion on the second term of the previous iteration
as

$$\binom{n}{m} = \binom{n-1}{m-1} + \binom{n-2}{m-1} + \binom{n-2}{m}$$

$$= \binom{n-1}{m-1} + \binom{n-2}{m-1} + \binom{n-3}{m-1} + \binom{n-3}{m}$$

and so on, to obtain the relation

$$\binom{n}{m} = \sum_{j=m-1}^{n-1} \binom{j}{m-1} .$$

Using this relation with the expression

$$\sum_{n=r}^{N} \binom{n-(k+1)r}{k} = \sum_{n=(k+1)r+k}^{N} \binom{n-(k+1)r}{k} ,$$

we make a change of variable to obtain

$$\sum_{n=r}^{N} \binom{n-(k+1)r}{k} = \sum_{j=k}^{N-(k+1)r} \binom{j}{k} = \binom{N-(k+1)r+1}{k+1} .$$

Similarly, we have

$$\sum_{n=r}^{N} \binom{n-(k+1)r-1}{k} = \binom{N-(k+1)r}{k+1} .$$

Finally, we use these last results in the expression for F(N;r) above to
obtain Equation (10) as

$$F(N;r) = q^r \sum_{k=0}^{m-1} (-p q^r)^k \left[ \binom{N-(k+1)r+1}{k+1} - q \binom{N-(k+1)r}{k+1} \right]$$
for $N \ge r$. Letting N = mr + i with $m \ge 0$ and $1 \le i \le r$, the upper limit on
this summation is m - 1.

Now, in the preceding equation let x = k + 1. Then

$$F(N;r) = q^r \sum_{x=1}^{m} (-p q^r)^{x-1} \left[ \binom{N-xr+1}{x} - q \binom{N-xr}{x} \right] .$$

Multiplying both sides by $(-p q^r)$ we have

$$-p q^r F(N;r) = q^r \sum_{x=1}^{m} (-p q^r)^{x} \left[ \binom{N-xr+1}{x} - q \binom{N-xr}{x} \right] .$$
This summation is the expression for $f_{N+r+1}$ with the x = 0 term missing.
Thus,

$$-p q^r F(N;r) = f_{N+r+1} - q^r p$$

where $p q^r$ is the zero order term. Rearranging, we have

$$f_{N+r+1} = p q^r \left[ 1 - F(N;r) \right]$$

or

$$f_N = p q^r \left[ 1 - F(N-r-1;r) \right] .$$
The recurrence relation

$$f_{n+1} = f_n - p q^r f_{n-r} ,$$

which can be taken as the defining equation for the probability of obtaining the
first run of r successes at the n th trial, is derived from the previous
result. That is,

$$f_{n+1} = p q^r \left[ 1 - F(n-r;r) \right]$$

$$f_n = p q^r \left[ 1 - F(n-r-1;r) \right] .$$

Subtracting the second equation from the first we have

$$f_{n+1} - f_n = -p q^r \left[ F(n-r;r) - F(n-r-1;r) \right] .$$

But the term in brackets is just $f_{n-r}$. Hence,

$$f_{n+1} = f_n - p q^r f_{n-r} .$$